AO-A096  366 
UNCLASSIFIED 


CARNEGIE-MELLON UNIV PITTSBURGH PA DEPT OF COMPUTER --ETC F/G 9/5
A LAYOUT FOR THE SHUFFLE-EXCHANGE NETWORK. (U)

AUG 80 D HOEY, C E LEISERSON
CMU-CS-80-139


N00014-76-C-0370
NL 


AD  A 0963  6  8 




CMU-CS-80-139 


A Layout for the
Shuffle-Exchange Network

Dan Hoey

Charles E. Leiserson

Department of Computer Science
Carnegie-Mellon University
Pittsburgh, Pennsylvania 15213


18 August 1980


Abstract 


This paper describes a technique for producing a VLSI layout of the shuffle-exchange graph. It is based on
the layout procedure in [2], which lays out a graph by bisecting the graph, recursively laying out the two
halves, and then combining the two sublayouts. The area of the layout is related to the number of edges that
must be cut to bisect the graph.

For the shuffle-exchange graph on $n$ vertices, we present a bisection schema for which the above procedure
yields an $O(n^2/\lg n)$ area layout when $n = 2^k$ and $k$ is a power of two. The bisection involves a mapping
from vertices of the graph to polynomials, and the polynomials are subsequently evaluated at complex roots of
unity. Incidental to this construction is a result on the combinatorial problem of necklace enumeration.

Addendum

The construction described in Section 6 requires $O(n^2/\lg n)$ area to lay out the shuffle-exchange graph on
$n = 2^k$ vertices when $k$ is a power of two, but this technique cannot be used for other values of $k$. Here we
provide a construction of an $O(n^2/\lg n)$ area layout for all values of $k$. This improved construction does not
use the results of Sections 5 and 6, but follows directly from the mapping $v \mapsto p_v(\omega)$ described in Section 4.

The idea behind the construction is simple. Figure 3 shows a layout of the shuffle-exchange graph on 64
vertices. Each exchange edge is allocated a constant-width track parallel to the x-axis. Each necklace is
allocated two tracks parallel to the y-axis. The shuffle edges in the necklace run up one track and down the
other. Since there are $O(n/\lg n)$ necklaces and $O(n)$ exchange edges, the area of the layout is $O(n^2/\lg n)$.

For this construction, the assignment of x-coordinates to necklaces is arbitrary, but the y-coordinates of the
vertices must satisfy two constraints. First, y-coordinates of vertices connected by an exchange edge must be
equal because exchange edges run parallel to the x-axis. Second, the y-coordinates of the vertices
$v, \sigma(v), \sigma(\sigma(v)), \ldots, \sigma^{-1}(v)$ in a given necklace must form a bitonic sequence [8], that is, the
coordinates must consist of an increasing sequence and a decreasing sequence. This condition guarantees that
two tracks suffice to route the shuffle edges.

In order to assign y-coordinates in this way, recall the mapping $v \mapsto p_v(\omega)$ from Section 4. First, vertices
connected by exchange edges are mapped to complex numbers with the same imaginary component. Second,
the vertices in each necklace, if not all mapped to the origin, are mapped to complex numbers in order on a
circle, which implies that the imaginary parts of these complex numbers form a bitonic sequence. Therefore,
an assignment of y-coordinates in the same order as $\mathrm{Im}(p_v(\omega))$, breaking ties arbitrarily, is consistent with the
two requirements listed above.
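
The bitonic claim can be checked numerically for small $k$. The following Python sketch is illustrative only: all function names are hypothetical, vertices are encoded as $k$-bit integers, and "bitonic" is read cyclically (at most one rise and one fall around the necklace), which is an assumption about the intended meaning. Floating-point comparisons use a small tolerance.

```python
import cmath

def char_value(v, k):
    """p_v(omega) for the k-bit vertex v, with omega = exp(2*pi*i/k)."""
    omega = cmath.exp(2j * cmath.pi / k)
    return sum(omega ** j for j in range(k) if (v >> j) & 1)

def shuffle(v, k):
    """Rotate the k-bit string v left by one position."""
    return ((v << 1) | (v >> (k - 1))) & ((1 << k) - 1)

def necklace(v, k):
    """Vertices v, sigma(v), sigma(sigma(v)), ... in shuffle order."""
    cycle, u = [v], shuffle(v, k)
    while u != v:
        cycle.append(u)
        u = shuffle(u, k)
    return cycle

def is_cyclically_bitonic(ys, eps=1e-9):
    """True if the cyclic sequence ys rises at most once and falls at most once."""
    steps = [ys[(i + 1) % len(ys)] - ys[i] for i in range(len(ys))]
    signs = [1 if d > 0 else -1 for d in steps if abs(d) > eps]  # ties ignored
    changes = sum(signs[i] != signs[(i + 1) % len(signs)] for i in range(len(signs)))
    return changes <= 2

def all_necklaces_bitonic(k):
    """Check the imaginary parts of p_v(omega) around every necklace."""
    return all(
        is_cyclically_bitonic([char_value(u, k).imag for u in necklace(v, k)])
        for v in range(1 << k)
    )
```

Calling `all_necklaces_bitonic(k)` for small `k` exercises the second constraint; necklaces mapped to the origin pass trivially because all their imaginary parts tie.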

Vertices mapped to the origin must also be assigned y-coordinates in a bitonic sequence. To do this, notice
that ties can be broken arbitrarily for those vertices which are mapped to the nonzero real line. For any
necklace $(v, \sigma(v), \sigma(\sigma(v)), \ldots, \sigma^{-1}(v))$ mapped to the origin, simply break ties among
$\varepsilon(v), \varepsilon(\sigma(v)), \varepsilon(\sigma(\sigma(v))), \ldots, \varepsilon(\sigma^{-1}(v))$ so that the y-coordinates of the vertices in the
necklace form a bitonic sequence.

This completes the construction.

The techniques of Sections 5 and 6 may still be of value despite this improved layout. The results of
Section 5 have applications beyond the area of VLSI layouts [4], and a better layout may yet be obtained using
those methods. It is possible, however, that the layout presented in this addendum may be improved in other
ways. For example, controlling the assignment of x-coordinates to necklaces might allow several exchange
edges to occupy different parts of the same track.


Figure 3: A layout of the shuffle-exchange graph on 64 vertices.
Vertical loops represent shuffle edges; horizontal lines represent
exchange edges.

1. Introduction

The  shuffle-exchange  network  has  been  shown  to  be  an  important  communications  structure  for  parallel 
processors.  Stone  [8]  describes  algorithms  which  use  this  structure  to  solve  several  problems,  including  the 
computation of the discrete Fourier transform and sorting bitonic sequences. The number of communications
steps required by these algorithms is typically a polynomial in the logarithm of the number of nodes in
the  network,  and  the  nodes  themselves  need  only  perform  relatively  simple  operations. 

VLSI designers often try to minimize the area used by a circuit subject to the requirements imposed by the
fabrication technology on the minimum feature sizes of the components [5]. In [9] Thompson develops lower
bounds on the growth of circuit area based on graph-theoretic properties of the communications structure.
He shows in particular that any layout of the shuffle-exchange network on $n = 2^k$ vertices must use at least
$\Omega(n^2/k^2)$ area. The arguments for Thompson's lower bounds are based on the minimum bisection width of a
graph, which is the least number of edges that must be removed to separate the graph into two equal-sized
subgraphs.

The concept of bisection width was extended by Lipton and Tarjan [3] to that of a separator theorem for a
class of graphs closed under the subgraph relation. In essence, a separator theorem for a class provides upper
bounds on the bisection widths of graphs in the class. Separator theorems allow the divide-and-conquer
paradigm to be exploited in the design of efficient algorithms for graph manipulation [4]. Recently, Leiserson [2]
has used this approach to design area-efficient VLSI layouts.

In this paper a theorem similar to a separator theorem is proven for the shuffle-exchange graph on $n = 2^k$
vertices. We exhibit a dissection that shows how the shuffle-exchange graph may be bisected, how the
resultant subgraphs may themselves be bisected, and so forth. We use this result to construct an $O(n^2/k)$ area
layout for the case when $k$ is a power of two, thereby improving Thompson's upper bound of $O(n^2/\sqrt{k})$. In
our proof the vertices of the shuffle-exchange graph are mapped to a polynomial space, and then the
polynomials are mapped to the complex plane. This construction also provides an asymptotic result on the
combinatorial problem of necklace enumeration.

The next section formalizes the notions of bisection and dissection. Section 3 introduces the
shuffle-exchange graph and describes its relationship to polynomials. In Section 4 we construct a bisection of the
shuffle-exchange graph whose width is $O(n/k)$, and in Section 5 we extend this result to produce a dissection.
In Section 6, the layout algorithm of [2] is applied to this dissection to produce an $O(n^2/k)$ area layout for the
shuffle-exchange graph. Section 7 concludes by comparing this result with other work in the field.

2. Graph Dissection


In  this  section,  we  formalize  concepts  pertaining  to  the  partitioning  of  a  graph  into  smaller  graphs  by  the 
removal  of  edges. 

A bisection $S$ of a graph $G = (V, E)$ into graphs $G' = (V', E')$ and $G'' = (V'', E'')$ is a disjoint partition of
the vertices $V = V' \cup V''$ together with a disjoint partition of the edges $E = E' \cup E'' \cup E_S$ such that the
cardinalities of $V'$ and $V''$ differ by at most one. The cardinality of $E_S$ is called the width of the bisection, and
the edges in $E_S$ are said to be removed by the bisection. The graphs $G'$ and $G''$ are called the halves of the
bisection.

Of course, any graph can be bisected by removing all its edges, but usually we are interested in removing as
few edges as possible. The minimum bisection width of a graph is the smallest number of edges that must be
removed to divide an $n$-vertex graph into a $\lceil n/2 \rceil$-vertex graph and a $\lfloor n/2 \rfloor$-vertex graph. Unfortunately, the
problem of finding the minimum bisection width of an arbitrary graph is NP-complete [1].

It is sometimes the case that every graph in a class of graphs can be bisected by the same general
mechanism. We define a separator for a class $\mathcal{G}$ of graphs to be a family $\mathcal{S}$ of bisections such that $\mathcal{S}$ contains a
bisection of every nontrivial graph $G$ in $\mathcal{G}$. Interesting separators are those that exhibit the closure property. A
separator $\mathcal{S}$ for a class $\mathcal{G}$ of graphs has this property if for any graph $G \in \mathcal{G}$, the halves $G'$ and $G''$ that are
produced by a bisection of $G$ in $\mathcal{S}$ are also in $\mathcal{G}$. Any separator with the closure property whose associated
class contains a particular graph $G$ is called a dissection of $G$.

A dissection $\mathcal{S}$ of $G$ may be thought of as a complete binary tree that has $G$ at the root, the halves of $G$ from
some bisection in $\mathcal{S}$ as its sons, and the halves of the halves as grandsons, and so forth to trivial graphs at the
leaves. If $G$ has $n$ vertices, then the subgraphs at level $j$ will have about $n/2^j$ vertices. Although there may be
other graphs in the class $\mathcal{G}$ associated with $\mathcal{S}$, at the very least $\mathcal{G}$ must contain all of the graphs in the tree.

In [3] Lipton and Tarjan introduce separator theorems which use ideas similar to those presented here. In
their work, however, the discussion is restricted to classes of graphs that are closed under the subgraph
relation. (A class $\mathcal{G}$ is closed under the subgraph relation if every subgraph $G'$ of a graph $G \in \mathcal{G}$ is also an
element of $\mathcal{G}$.) We have departed from their approach because the results of this paper rely on properties of
the shuffle-exchange graph that do not hold for all of its subgraphs.

3. The Shuffle-Exchange Graph

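The shuffle-exchange graph on $n$ vertices is defined only when $n$ is a power of two. Each of the $n = 2^k$
vertices can be identified with an element of the Cartesian product

$$\{0,1\}^k = \{\, b_{k-1} b_{k-2} \cdots b_1 b_0 \mid b_j \in \{0,1\} \,\}.$$

Each vertex $v \in \{0,1\}^k$ is incident on an exchange edge $(v, \varepsilon(v))$ and two shuffle edges $(v, \sigma(v))$ and
$(v, \sigma^{-1}(v))$, where $\varepsilon$ and $\sigma$ are permutations defined by

$$\varepsilon(b_{k-1} b_{k-2} \cdots b_1 b_0) = b_{k-1} b_{k-2} \cdots b_1 \overline{b_0}, \tag{1}$$

$$\sigma(b_{k-1} b_{k-2} \cdots b_1 b_0) = b_{k-2} \cdots b_1 b_0 b_{k-1}, \tag{2}$$

where $\overline{b_0} = 1 - b_0$ denotes the complement of $b_0$.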

In the literature the vertices are usually identified with integers from zero to $n-1$ represented in binary
notation. The shuffle permutation $\sigma$ is then the permutation applied to a deck of $n$ cards by a perfect riffle
shuffle, in which case $\sigma(m) \equiv 2m \pmod{n-1}$. The exchange permutation $\varepsilon$ is the permutation that exchanges
pairs of adjacent elements of the vertex set, so that $\varepsilon(m) = m \pm 1$.

The shuffle-exchange graph is highly structured because of the shuffle permutation. From equation (2) we
see that $\sigma(v)$ can be determined from $v$ by rotating the indices of $v$ to the left one position. The shuffle
permutation partitions $\{0,1\}^k$ into equivalence classes known as necklaces [7], where two vertices are
equivalent whenever the indices of one are a cyclic permutation of the indices of the other. Since rotation by
$k$ positions yields the original vertex, the cardinality of a necklace cannot exceed $k$.
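
As a concrete illustration, the two permutations and the necklace partition can be written in a few lines of Python. This is a minimal sketch under the integer encoding mentioned above (a vertex is the integer whose binary digits are $b_{k-1} \cdots b_0$); the function names are hypothetical and chosen only for this example.

```python
def shuffle(v, k):
    """sigma: rotate the k-bit string v left by one position, as in equation (2)."""
    return ((v << 1) | (v >> (k - 1))) & ((1 << k) - 1)

def exchange(v):
    """epsilon: complement the low-order bit b_0, as in equation (1)."""
    return v ^ 1

def necklaces(k):
    """Partition {0,1}^k into the cycles of sigma, each listed in shuffle order."""
    seen, result = set(), []
    for v in range(1 << k):
        if v in seen:
            continue
        cycle, u = [], v
        while u not in seen:
            seen.add(u)
            cycle.append(u)
            u = shuffle(u, k)
        result.append(cycle)
    return result
```

For example, `necklaces(5)` groups the 32 vertices into the two singleton necklaces 00000 and 11111 and six necklaces of five vertices each.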

The properties that we shall use to dissect the shuffle-exchange graph are expressed conveniently in terms
of the characteristic polynomial, which is defined for a vertex $v = b_{k-1} \cdots b_0 \in \{0,1\}^k$ as

$$p_v(x) = \sum_{j=0}^{k-1} b_j x^j. \tag{3}$$

It should be apparent that $p_v(2)$ is precisely the vector $v$ considered as a binary number, as discussed above.
The following lemma shows the relationship between the characteristic polynomial and the shuffle and
exchange permutations.

Lemma 1: For all $v \in \{0,1\}^k$,

$$p_{\varepsilon(v)}(x) = p_v(x) \pm 1, \tag{4}$$

$$p_{\sigma(v)}(x) \equiv x\,p_v(x) \pmod{x^k - 1}, \tag{5}$$

where the congruence (5) is taken over the ring $\mathbb{Z}[x]$ of polynomials with integer coefficients.


Proof. From the defining equations (1) and (2),

$$p_v(x) - p_{\varepsilon(v)}(x) = b_0 x^0 - (1 - b_0) x^0 = 2 b_0 - 1,$$

$$x\,p_v(x) - p_{\sigma(v)}(x) = b_{k-1} x^k - b_{k-1} x^0 = b_{k-1}(x^k - 1).$$

The lemma follows from the fact that each $b_j$ is either zero or one. □
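
Lemma 1 can also be confirmed by exhaustive computation for small $k$. The sketch below is an illustration, not part of the proof: it represents $p_v(x)$ as the coefficient list $[b_0, \ldots, b_{k-1}]$ and uses the fact that multiplying by $x$ modulo $x^k - 1$ is a cyclic shift of that list. All names are hypothetical.

```python
def char_poly(v, k):
    """Coefficient list [b_0, ..., b_{k-1}] of the characteristic polynomial p_v(x)."""
    return [(v >> j) & 1 for j in range(k)]

def times_x_mod(coeffs):
    """x * p(x) reduced modulo x^k - 1, where k = len(coeffs)."""
    return [coeffs[-1]] + coeffs[:-1]

def shuffle(v, k):
    """sigma: rotate the k-bit string v left by one position."""
    return ((v << 1) | (v >> (k - 1))) & ((1 << k) - 1)

def exchange(v):
    """epsilon: complement the low-order bit."""
    return v ^ 1

def lemma1_holds(k):
    """Check equations (4) and (5) for every vertex of {0,1}^k."""
    for v in range(1 << k):
        p = char_poly(v, k)
        diff = [a - b for a, b in zip(char_poly(exchange(v), k), p)]
        # (4): the polynomials differ by the constant +1 or -1
        if diff[0] not in (1, -1) or any(diff[1:]):
            return False
        # (5): p_sigma(v) is x * p_v reduced modulo x^k - 1
        if char_poly(shuffle(v, k), k) != times_x_mod(p):
            return False
    return True
```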

The cyclic structure of necklaces is exploited in Section 4 to bisect the shuffle-exchange graph. This is done
in such a way that most of the necklaces in the graph are bisected. When the number of vertices in a necklace
is even, it turns out that the half-necklaces also have a cyclic structure. An $m$-cycle is defined to be an ordered
sequence $(v_0, v_1, \ldots, v_{m-1})$ of $m$ distinct vertices such that for $j = 1, \ldots, m-1$,

$$p_{v_j}(x) \equiv x\,p_{v_{j-1}}(x) \pmod{x^m - 1}. \tag{6}$$

The  next  lemma  provides  justification  for  calling  such  a  sequence  an  m-cycle. 

Lemma 2: Let $(v_0, \ldots, v_{m-1})$ be an $m$-cycle. Any sequence $(v_j, \ldots, v_{m-1}, v_0, \ldots, v_{j-1})$
formed by cyclically permuting $(v_0, \ldots, v_{m-1})$ is also an $m$-cycle. If $d$ is a divisor of $m$, then the
subsequence $(v_0, \ldots, v_{d-1})$ is a $d$-cycle.

Proof. This lemma can be proved by manipulating the congruence (6) in the definition of an $m$-cycle. The
congruence can be iterated to yield

$$p_{v_{m-1}}(x) \equiv x^{m-1} p_{v_0}(x) \pmod{x^m - 1},$$

and since $x^m \equiv 1 \pmod{x^m - 1}$, it follows that

$$x\,p_{v_{m-1}}(x) \equiv p_{v_0}(x) \pmod{x^m - 1}.$$

Thus (6) holds between the first and last vertices as well as between adjacent vertices, implying that the choice
of a first vertex is immaterial. To prove the second part of the lemma, observe that congruence (6) modulo
$x^m - 1$ must also hold modulo its divisor $x^d - 1$. □

Congruence (5) shows that a necklace of $k$ vertices is a $k$-cycle. Lemma 2 establishes that when $k$ is even,
the necklace can be bisected to yield two $k/2$-cycles.
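
The definition of an $m$-cycle can be tested directly by reducing coefficient lists modulo $x^m - 1$, which amounts to folding exponents modulo $m$. The following Python sketch (hypothetical helper names, vertices encoded as $k$-bit integers) checks congruence (6) for a given sequence; it is meant as an illustration of Lemma 2 on small examples rather than a general tool.

```python
def reduce_mod(coeffs, m):
    """Reduce a polynomial, given as a coefficient list, modulo x^m - 1."""
    out = [0] * m
    for j, c in enumerate(coeffs):
        out[j % m] += c          # x^j is congruent to x^(j mod m)
    return out

def is_m_cycle(seq, k, m):
    """Check that seq has m distinct vertices satisfying congruence (6)."""
    if len(seq) != m or len(set(seq)) != m:
        return False
    for prev, cur in zip(seq, seq[1:]):
        x_p_prev = [0] + [(prev >> j) & 1 for j in range(k)]   # x * p_prev(x)
        p_cur = [(cur >> j) & 1 for j in range(k)]
        if reduce_mod(x_p_prev, m) != reduce_mod(p_cur, m):
            return False
    return True
```

For $k = 6$ the necklace of 000011 is `[3, 6, 12, 24, 48, 33]`; the function accepts the whole necklace as a 6-cycle and either half as a 3-cycle, in line with Lemma 2.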


4.  Bisecting  the  Shuffle-Exchange  Graph 

The concepts developed in Section 3 are applied in this section to construct a bisection of the
shuffle-exchange graph on $n = 2^k$ vertices. The construction is obtained by evaluating the characteristic polynomials
of the vertices at a complex $k$th root of unity, inducing a mapping from $\{0,1\}^k$ to the complex plane. The
complex plane is then divided to induce a bisection of the shuffle-exchange graph. A corollary of this
construction is an asymptotic result on the number of necklaces.


Let $\omega = e^{2\pi i/k}$ be the principal primitive complex $k$th root of unity, and consider the mapping $v \mapsto p_v(\omega)$
from $\{0,1\}^k$ to the complex plane. Figure 1 graphs the values of $p_v(\omega)$ for $k = 5$. The vertices are labeled
with $p_v(2)$. The solid lines forming pentagons concentric about the origin represent shuffle edges, and the
horizontal dotted arcs represent exchange edges.

Figure 1: The shuffle-exchange graph on $32 = 2^5$ vertices mapped to
the complex plane by $v \mapsto p_v(\omega)$ (real axis from $-2$ to $+2$, imaginary axis
from $-2i$ to $+2i$). Vertices are labelled with $p_v(2)$.
Dotted lines represent exchange edges, and solid lines represent
shuffle edges.

Let us examine this figure in relation to Lemma 1. The occurrence of regular $k$-gons of shuffle edges can
be explained by congruence (5). Since $\omega$ is a root of $x^k - 1$, this congruence becomes the equality
$p_{\sigma(v)}(\omega) = \omega\,p_v(\omega)$. Thus $p_{\sigma(v)}(\omega)$ is the point obtained from $p_v(\omega)$ by a counterclockwise rotation of $2\pi/k$
radians about the origin. The vertices in a necklace are mapped to $k$ points equally spaced on a circle about
the origin, unless the entire necklace is mapped to the origin. The fact that exchange edges are horizontal can
be explained by equation (4) in Lemma 1. If vertices $v$ and $\varepsilon(v)$ are incident on an exchange edge, then they
are mapped to complex numbers that have the same imaginary part and differ by one in the real part.
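
Both geometric facts are easy to confirm numerically. The sketch below, with hypothetical function names and a floating-point tolerance `eps` chosen for illustration, evaluates $p_v(\omega)$ for every vertex and checks that the shuffle acts as a rotation by $2\pi/k$ and that the exchange moves a point horizontally by exactly one unit.

```python
import cmath

def char_value(v, k):
    """p_v(omega) for the k-bit vertex v, with omega = exp(2*pi*i/k)."""
    omega = cmath.exp(2j * cmath.pi / k)
    return sum(omega ** j for j in range(k) if (v >> j) & 1)

def shuffle(v, k):
    """sigma: rotate the k-bit string v left by one position."""
    return ((v << 1) | (v >> (k - 1))) & ((1 << k) - 1)

def check_geometry(k, eps=1e-9):
    """Verify p_sigma(v)(omega) = omega * p_v(omega) and the horizontal unit shift."""
    omega = cmath.exp(2j * cmath.pi / k)
    for v in range(1 << k):
        z = char_value(v, k)
        if abs(char_value(shuffle(v, k), k) - omega * z) > eps:
            return False                      # shuffle is not a rotation by 2*pi/k
        ze = char_value(v ^ 1, k)             # image of the exchange partner
        if abs(ze.imag - z.imag) > eps or abs(abs(ze.real - z.real) - 1) > eps:
            return False                      # exchange edge is not horizontal of length 1
    return True
```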
The bisection of the shuffle-exchange graph will be achieved by partitioning the vertices based on the
imaginary part of $p_v(\omega)$, with tie-breaking when $p_v(\omega)$ is real. All edges that cross the real line will be
removed, and it will be shown that there are at most $O(n/k)$ of these. This bound is easily shown for edges
whose incident vertices are not involved in the tie-breaking. Since there are $n$ vertices in the shuffle-exchange
graph, there are at most $n/k$ regular $k$-gons of shuffle edges, and each of these $k$-gons crosses the real line
twice. Since exchange edges are horizontal, they never cross the real line.

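In order to define the bisection formally, we first partition the nonzero complex numbers as $C^+ \cup C^-$ where

$$C^+ = \{\, z \in \mathbb{C} \mid \mathrm{Im}(z) > 0 \,\} \cup \{\, x \in \mathbb{R} \mid x > 0 \,\},$$

$$C^- = \{\, z \in \mathbb{C} \mid \mathrm{Im}(z) < 0 \,\} \cup \{\, x \in \mathbb{R} \mid x < 0 \,\}.$$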

The halves $G'$ and $G''$ are defined by the regions to which vertices of the shuffle-exchange graph are mapped.
The vertices for which $p_v(\omega) \in C^+$ are assigned to $V'$ and those for which $p_v(\omega) \in C^-$ are assigned to $V''$. The
remaining vertices, those for which $p_v(\omega) = 0$, are distributed arbitrarily but equally between $V'$ and $V''$.
Three types of edges are placed in $E_S$:

1. Exchange edges whose incident vertices are mapped to real numbers.

2. Shuffle edges whose incident vertices are mapped to the origin.

3. Shuffle edges between vertices $v$ and $v'$ such that $p_v(\omega) \in C^+$ and $p_{v'}(\omega) \in C^-$.

It can be seen by inspection that $E_S$ is a superset of the set of edges that connect $V'$ to $V''$. Edges not in $E_S$ are
allocated to $E'$ or $E''$ according as their incident vertices are in $V'$ or $V''$.

To see that $|V'| = |V''|$, consider for any vertex $v$ the vertex $C(v)$ obtained by complementing every index in
the vector $v$. This relationship can be restated in terms of characteristic polynomials as

$$p_{C(v)}(x) = (x^{k-1} + x^{k-2} + \cdots + 1) - p_v(x).$$

Because the sum of all $k$th roots of unity is zero, it follows that $p_v(\omega) = -p_{C(v)}(\omega)$. Therefore, the
correspondence $v \leftrightarrow C(v)$ is a one-to-one correspondence between the vertices mapped to $C^+$ and those
mapped to $C^-$. This proves that this partition is a bisection as was claimed. The cardinality of $E_S$ is the width
of the bisection and is bounded by the following theorem.
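
The step that uses the sum of the roots of unity can be written out as a short worked equation. This is a sketch that assumes $k \ge 2$, so that $\omega \ne 1$ and the geometric series formula applies:

```latex
p_v(\omega) + p_{C(v)}(\omega)
  = \sum_{j=0}^{k-1} \omega^j
  = \frac{\omega^k - 1}{\omega - 1}
  = 0,
\qquad\text{hence}\qquad
p_{C(v)}(\omega) = -p_v(\omega).
```

Since $z \in C^+$ exactly when $-z \in C^-$, complementation pairs each vertex mapped to $C^+$ with a distinct vertex mapped to $C^-$.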

Theorem 3: For any positive integer $k$, there is a bisection $S$ of the shuffle-exchange graph on
$n = 2^k$ vertices such that the width of $S$ is at most $6(n/k)$.


Proof. Let $S$ be the bisection described above, and consider the three types of edges that compose $E_S$. We
will bound each of the three types by the quantity $2(n/k)$.

Each of the type 3 edges is a shuffle edge incident on vertices mapped to nonzero complex numbers, and
each such vertex belongs to a necklace of exactly $k$ vertices which are mapped to nonzero numbers. Since the
total number of vertices in the shuffle-exchange graph is $n$, there can be at most $n/k$ such necklaces. The
shuffle edges in each of these necklaces form a regular $k$-gon centered at the origin, and thus only two of these
edges can cross the real line, in the sense of having one incident vertex mapped to $C^+$ and the other to $C^-$.
Thus there can be at most $2(n/k)$ type 3 edges.

The same argument can be used to bound the number of type 1 edges. There are at most $2(n/k)$ vertices
mapped to nonzero real numbers. Since every exchange edge whose incident vertices are mapped to real
numbers has at least one of these vertices mapped to a nonzero real number, there can be no more than
$2(n/k)$ type 1 edges.

Finally, the number of type 2 edges can be bounded by the number of type 1 edges by observing that for
each shuffle edge $(v, \sigma(v))$ whose incident vertices are mapped to the origin, the exchange edge $(\sigma(v), \varepsilon(\sigma(v)))$
is a type 1 edge. □
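
The three edge counts in this proof can be tallied by brute force for small $k$. The Python sketch below is illustrative and rests on stated assumptions: vertices are $k$-bit integers, each shuffle edge is counted once per source vertex $v$ (as the directed edge from $v$ to $\sigma(v)$, which matches the "two crossings per $k$-gon" count), and membership in $C^+$, $C^-$, or the origin is decided with a floating-point tolerance `eps`. The function name is hypothetical.

```python
import cmath

def bisection_width(k, eps=1e-9):
    """Return (|E_S|, #vertices in C+, #vertices in C-) for the bisection above."""
    n = 1 << k
    omega = cmath.exp(2j * cmath.pi / k)
    z = [sum(omega ** j for j in range(k) if (v >> j) & 1) for v in range(n)]

    def side(w):
        """+1 for C+, -1 for C-, 0 for the origin."""
        if abs(w) < eps:
            return 0
        if w.imag > eps or (abs(w.imag) <= eps and w.real > 0):
            return 1
        return -1

    sides = [side(w) for w in z]

    def sigma(v):
        return ((v << 1) | (v >> (k - 1))) & (n - 1)

    # Type 1: exchange edges (v, v ^ 1) whose endpoints are mapped to real numbers.
    type1 = sum(1 for v in range(0, n, 2) if abs(z[v].imag) <= eps)
    # Type 2: shuffle edges leaving a vertex mapped to the origin.
    type2 = sum(1 for v in range(n) if sides[v] == 0)
    # Type 3: shuffle edges with one endpoint in C+ and the other in C-.
    type3 = sum(1 for v in range(n) if sides[v] * sides[sigma(v)] == -1)
    return type1 + type2 + type3, sides.count(1), sides.count(-1)
```

For $k = 3$ the tally is 2 + 2 + 4 = 8 removed edges, comfortably below the bound $6n/k = 16$.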

We now pause to examine an interesting by-product of these counting arguments, a result on the
combinatorial problem of necklace enumeration. A necklace is a string of $k$ pearls, where each pearl may be
one of $c$ colors. Two necklaces are considered equivalent if one can be rotated to form the other, but not if
they are only reflections. It is well-known [7] that the number of necklaces of $k$ pearls in $c$ colors is

$$\frac{1}{k} \sum_{d \mid k} \varphi(d)\, c^{k/d}. \tag{7}$$

In this formula $\varphi(d)$ is Euler's totient function, the number of positive integers not exceeding $d$ that are
relatively prime to $d$. Although it appears that the term for $d = 1$ in (7) might dominate the summation, it is
not apparent that the contribution of the other terms is insignificant. However, the following corollary to
Theorem 3 shows that this term is asymptotically dominant.

Corollary: The number of necklaces of $\{0, 1, \ldots, c-1\}^k$ lies between $c^k/k$ and
$((c+1)/(c-1))\,(c^k/k)$.

Proof. The definitions of the $\sigma$ and $\varepsilon$ permutations may be extended to $\{0, 1, \ldots, c-1\}^k$ as follows.

$$\varepsilon(b_{k-1} b_{k-2} \cdots b_1 b_0) = b_{k-1} b_{k-2} \cdots b_1 \,(b_0 + 1 \bmod c),$$

$$\sigma(b_{k-1} b_{k-2} \cdots b_1 b_0) = b_{k-2} \cdots b_1 b_0 b_{k-1}.$$

The characteristic polynomial is defined as before (notice that now $p_v(c)$ is the vector $v$ considered as a
number expressed in base $c$ notation), and the argument of Theorem 3 can be adapted to show that the
function $v \mapsto p_v(\omega)$ maps at most $2c^k/((c-1)k)$ elements of $\{0, 1, \ldots, c-1\}^k$ to zero and that the remainder lie
in necklaces of $k$ elements. □
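
Formula (7) and the bounds of the corollary can be compared against a direct enumeration for small parameters. The following Python sketch is illustrative; the function names are hypothetical, and the brute-force count simply marks all rotations of each string as seen.

```python
from itertools import product
from math import gcd

def totient(d):
    """Euler's totient: integers in 1..d relatively prime to d."""
    return sum(1 for i in range(1, d + 1) if gcd(i, d) == 1)

def necklace_count(k, c):
    """Number of necklaces of k pearls in c colors, by formula (7)."""
    total = sum(totient(d) * c ** (k // d) for d in range(1, k + 1) if k % d == 0)
    return total // k

def brute_force_count(k, c):
    """Count rotation classes of strings in {0, ..., c-1}^k directly."""
    seen, count = set(), 0
    for s in product(range(c), repeat=k):
        if s in seen:
            continue
        count += 1
        for r in range(k):
            seen.add(s[r:] + s[:r])   # mark every rotation of s
    return count
```

For $k = 5$ and $c = 2$ both functions give 8, which lies between $c^k/k = 6.4$ and $3 \cdot 6.4 = 19.2$ as the corollary requires.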


5. Dissecting the Shuffle-Exchange Graph

In the previous section, we presented a bisection of the shuffle-exchange graph on $n = 2^k$ vertices. In this
section we will show that when $k$ is even, the structure of the halves is similar to the structure of the original
shuffle-exchange graph. This similarity is captured in the notion of an $m$-cyclic subgraph of the
shuffle-exchange graph, and it is shown that the halves are $k/2$-cyclic subgraphs. The bisection from Theorem 3 can
be modified to bisect $m$-cyclic subgraphs when $m$ is even. Thus when $k$ is a power of two, this approach can
be used iteratively to construct a complete dissection of the shuffle-exchange graph.

An $m$-cyclic subgraph is a subgraph of the shuffle-exchange graph whose vertices are partitioned into
disjoint $m$-cycles. Vertices not appearing in these $m$-cycles are also allowed, but such vertices must be
isolated, not incident on any edge in the subgraph. If a shuffle edge $(v, \sigma(v))$ appears as an edge of the
$m$-cyclic subgraph, it must occur between adjacent vertices of one of the $m$-cycles, and the exchange edge
$(\sigma(v), \varepsilon(\sigma(v)))$ must be an edge of the $m$-cyclic subgraph as well.

The reader should be warned that $m$-cyclic subgraphs are nothing more than a vehicle for extending the
bisection of the shuffle-exchange graph to a dissection. The definition has been carefully crafted so that the
proof of Theorem 3 will apply to them and so that their separator exhibits the closure property.

Lemma 4: When $k$ is even, the halves $G'$ and $G''$ produced by the bisection from Theorem 3 are
$k/2$-cyclic subgraphs.

Proof. Without loss of generality, we show this for $G'$ only. The vertices that are mapped to zero by
$v \mapsto p_v(\omega)$ have no incident edges (are isolated), but every other vertex of $G'$ occurs in some sequence
$(v_0, \ldots, v_{k/2-1})$ that arose from cutting a necklace of $k$ vertices in half. Since any necklace of $k$ vertices is a
$k$-cycle, and $k/2$ divides $k$, Lemma 2 ensures that this sequence is a $k/2$-cycle. Thus we have demonstrated
the first requirement for $G'$ to be a $k/2$-cyclic subgraph: every vertex not in an $m$-cycle is isolated.

We must now show that if a shuffle edge $(v, \sigma(v))$ appears as an edge in $G'$, then it occurs between adjacent
vertices of one of the $m$-cycles, and furthermore, that then the exchange edge $(\sigma(v), \varepsilon(\sigma(v)))$ is also in $G'$. It is
clear that the first condition is satisfied. The second condition can be demonstrated by observing that both $v$
and $\sigma(v)$ are mapped to $C^+$. Since the point $p_{\sigma(v)}(\omega)$ can be obtained from $p_v(\omega)$ by a counterclockwise
rotation of $2\pi/k < \pi$ radians about the origin, it is impossible for $\sigma(v)$ to be mapped to the real line. The set
of removed edges $E_S$ contains only those exchange edges whose incident vertices are mapped to real points,
which means that $(\sigma(v), \varepsilon(\sigma(v)))$ must be in $E'$. □

When $m$ is even, the bisection from Theorem 3 can be generalized to a bisection of an arbitrary $m$-cyclic
subgraph. Let $\omega_m = e^{2\pi i/m}$ and consider the function $v \mapsto p_v(\omega_m)$. Since $\omega_m$ is a root of $x^m - 1$, the
congruence (6) between adjacent vertices of $m$-cycles becomes the equality $p_{v_j}(\omega_m) = \omega_m\,p_{v_{j-1}}(\omega_m)$. This
means that if any vertex of an $m$-cycle is mapped to a nonzero complex number, all the $m$ vertices of the
$m$-cycle are mapped to distinct points evenly spaced on a circle about the origin. Equation (4) applies as before
to show that vertices connected by an exchange edge are mapped to complex numbers which differ by one.


Let $G$ be an arbitrary $m$-cyclic subgraph of a shuffle-exchange graph on $n = 2^k$ vertices, and suppose that $m$
is even. In order to construct a bisection of $G$, the vertices of the $m$-cycles of $G$ are assigned to $V'$ or $V''$
according as they are mapped by $v \mapsto p_v(\omega_m)$ to $C^+$ or $C^-$. The remaining vertices of $G$ are those vertices that
are mapped to the origin and those that are isolated. These may be divided arbitrarily but equally between $V'$
and $V''$. As with the bisection from Theorem 3, $E_S$ consists of three types of edges.

1. Exchange edges whose incident vertices are mapped to real numbers.

2. Shuffle edges whose incident vertices are mapped to the origin.

3. Shuffle edges between vertices $v$ and $v'$ such that $p_v(\omega_m) \in C^+$ and $p_{v'}(\omega_m) \in C^-$.

The remaining edges are assigned to $E'$ or $E''$ depending on whether their incident vertices are in $V'$ or $V''$.


Unlike before, however, the correspondence $v \leftrightarrow C(v)$ cannot be used to show that $|V'| = |V''|$, since $v$ may
be a vertex of $G$ when $C(v)$ is not. But because $m$ is even, the equality $p_{v_j}(\omega_m) = -p_{v_{j+m/2}}(\omega_m)$ holds for
vertices $v_j$ and $v_{j+m/2}$ in the same $m$-cycle, and the correspondence $v_j \leftrightarrow v_{j+m/2}$ suffices to show that this
partition is a bisection. The following lemma provides a bound for the width of the bisection.


Lemma 5: Let $m$ be even, and let $G$ be an $m$-cyclic subgraph on $t$ vertices. There is a bisection $S$
that bisects $G$ into $m/2$-cyclic subgraphs and has width at most $6t/m$.


Proof. Let $S$ be the bisection just described. Its width can be bounded by showing that there are at most
$2t/m$ of each of the three types of edges in $E_S$. This bound holds for type 3 edges because there can be at
most $t/m$ disjoint $m$-cycles in $G$ and no more than two type 3 edges per $m$-cycle. Since each type 1 edge has at
least one incident vertex mapped to a nonzero real number, and there are at most two such vertices per
$m$-cycle, the bound holds for these edges. Finally, for any type 2 edge $(v, \sigma(v))$, the edge $(\sigma(v), \varepsilon(\sigma(v)))$ is a
type 1 edge because $G$ is an $m$-cyclic subgraph. Thus there can be no more type 2 edges than type 1 edges,
and the bound on the width of the bisection is proved. It should be remarked here that the definition of
$m$-cyclic subgraphs was specifically constructed in order to establish this correspondence between type 1 and
type 2 edges.


To prove that the halves of the bisection are $m/2$-cyclic subgraphs, observe that the bisection $S$ isolates
those vertices that are in $m$-cycles mapped to the origin, and splits the other $m$-cycles into pairs of $m/2$-cycles.
Since shuffle edges appear only between adjacent vertices of $m$-cycles, this adjacency is preserved in the
$m/2$-cycles. The only exchange edges removed by the bisection are those whose incident vertices are mapped to
real numbers, and hence the argument of Lemma 4 can be used to show that if $(v, \sigma(v))$ is in one of the halves,
then $(\sigma(v), \varepsilon(\sigma(v)))$ is also in the half. □




We are now ready to combine this bisection with the bisection from Theorem 3 into a dissection of the
shuffle-exchange graph on $n = 2^k$ vertices for the case when $k$ is a power of two. Recall from Section 2 that to
dissect this graph, we need to find a class of subgraphs that has a separator with the closure property. The
next theorem provides such a class.

Theorem 6: If $k$ is a power of two, then there is a dissection $\mathcal{S}_n$ of the shuffle-exchange graph on
$n = 2^k$ vertices such that any bisection in $\mathcal{S}_n$ which bisects an $m$-vertex graph has width at most

$$f_n(m) = \begin{cases} 6n/k & \text{if } m > n/k, \\ 0 & \text{otherwise.} \end{cases} \qquad (8)$$

Proof. Let $\mathcal{G}_n$ be the class of subgraphs consisting of i) the shuffle-exchange graph itself, ii) its $k/2^j$-cyclic
subgraphs that have $n/2^j$ vertices, for $j = 1, \ldots, (\lg k) - 1$, and iii) its subgraphs that have no edges.
Correspondingly, the separator $\mathcal{S}_n$ consists of i) the bisection of the shuffle-exchange graph from Theorem 3,
ii) the bisections of its $k/2^j$-cyclic subgraphs from Lemma 5, and iii) arbitrary bisections of the totally
disconnected subgraphs. To see that the closure property holds for $\mathcal{S}_n$, we first observe that the halves of the
shuffle-exchange graph are $k/2$-cyclic subgraphs with $n/2$ vertices. For $j = 1, \ldots, (\lg k) - 2$, the halves of the
$k/2^j$-cyclic subgraphs with $n/2^j$ vertices are $k/2^{j+1}$-cyclic subgraphs with $n/2^{j+1}$ vertices. When $j = (\lg k) - 1$
the bisection from Lemma 5 uses the mapping $v \mapsto p_v(\omega_2)$ to bisect 2-cyclic subgraphs. Since $\omega_2 = -1$, all
vertices are mapped to real numbers, and thus the halves consist entirely of isolated vertices.

The bisection of the shuffle-exchange graph from Theorem 3 has width $6(n/k)$. For $j = 1, \ldots, (\lg k) - 1$,
the bisection from Lemma 5 bisects a $k/2^j$-cyclic subgraph of $n/2^j$ vertices with width
$6(n/2^j)/(k/2^j) = 6(n/k)$. The totally disconnected graphs can be bisected with zero width. □
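The width computation in this proof can be tabulated for a concrete case. The following Python sketch is an illustration only: the names `width_bound` and `dissection_levels` are hypothetical and do not appear in the paper. It evaluates equation (8) and the Lemma 5 bound $6t/m$ with $t = n/2^j$ and $m = k/2^j$, assuming $k$ is a power of two.

```python
def width_bound(k: int, m: int) -> int:
    """Evaluate f_n(m) from equation (8) for n = 2**k, with k a power of two."""
    n = 1 << k
    # n/k is an integer here because k is a power of two that divides n = 2**k.
    return 6 * n // k if m > n // k else 0


def dissection_levels(k: int):
    """List (j, cycle length k/2^j, vertices n/2^j, Lemma 5 width) for j = 1..lg(k)-1."""
    assert k >= 2 and k & (k - 1) == 0, "k must be a power of two"
    n = 1 << k
    lg_k = k.bit_length() - 1
    levels = []
    for j in range(1, lg_k):
        cycle_len = k >> j                 # m = k / 2^j
        vertices = n >> j                  # t = n / 2^j
        width = 6 * vertices // cycle_len  # Lemma 5: width at most 6t/m
        levels.append((j, cycle_len, vertices, width))
    return levels


if __name__ == "__main__":
    k = 8  # n = 256 vertices
    for level in dissection_levels(k):
        print(level)
```

At every level the Lemma 5 width equals $6n/k$, which is the value that `width_bound` returns for any subgraph with more than $n/k$ vertices.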


6.  Laying  Out  the  Shuffle-Exchange  Graph 

Given a dissection of an arbitrary graph, the divide-and-conquer technique of [2] can produce a VLSI
layout whose area is related to the bisection widths of the graphs in the dissection. The VLSI model used is
that of [9], and its important attributes are that wires have a minimum width and that only a constant number
may cross at a point. In this section the results of Section 5 are applied to produce an $O(n^2/\lg n)$ area VLSI
layout for an $n$-vertex shuffle-exchange network.

The technique of [2] constructs a layout for a general graph $G$ by first bisecting $G$ and laying out the halves
recursively. The halves are then placed side-by-side, and the edges that were removed to bisect $G$ are routed
between the halves. The layout area can therefore be described as a recurrence in the area of the halves and
the area required to route the edges removed by the bisection. This latter quantity is a function of the
bisection widths in the dissection of $G$ because the length and width of the layout increase by a constant
amount for each edge routed.


The particulars of how the area recurrence arises from this construction are described more fully in [2].
Some solutions to the recurrence are also given in that paper, but the bisection width bound $f_n(m)$ from
equation (8) fails to satisfy certain conditions that are assumed for those solutions. Therefore, we give the area
recurrence from [2] without further justification, but present its solution in detail.

Let $A_n(m)$ be the area of the layout of an $m$-vertex graph in the dissection of Theorem 6 (thus $A_n(n)$ is the
area of the original shuffle-exchange graph). We express $A_n(m)$ in terms of $f_n(m)$ from equation (8). For the
initial condition of the area recurrence, $A_n(1)$ is a constant, and for $1 < m \le n$,

$$A_n(m) = \left[ \sqrt{2 A_n(m/2)} + f_n(m) \right]^2. \qquad (9)$$

The recurrence can be solved by taking the square root of both sides and then substituting $L_n(m)$ for
$\sqrt{A_n(m)}$. For $1 < m \le n$ this yields

$$L_n(m) = \sqrt{2}\, L_n(m/2) + f_n(m).$$
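The agreement between recurrence (9) and its square-root form can be checked numerically. The sketch below is not taken from [2]. It assumes the constant initial condition $A_n(1) = 1$ (an arbitrary choice, exposed as the parameter `a1`), and the function names are hypothetical.

```python
import math


def f(n: int, k: int, m: int) -> float:
    """Bisection width bound f_n(m) from equation (8)."""
    return 6 * n / k if m > n / k else 0.0


def area(n: int, k: int, m: int, a1: float = 1.0) -> float:
    """A_n(m) from recurrence (9); a1 is the assumed constant A_n(1)."""
    if m == 1:
        return a1
    return (math.sqrt(2 * area(n, k, m // 2, a1)) + f(n, k, m)) ** 2


def side(n: int, k: int, m: int, a1: float = 1.0) -> float:
    """L_n(m) = sqrt(A_n(m)), computed from the linear recurrence."""
    if m == 1:
        return math.sqrt(a1)
    return math.sqrt(2) * side(n, k, m // 2, a1) + f(n, k, m)


if __name__ == "__main__":
    k = 8
    n = 2 ** k
    # The two printed values should agree up to floating-point rounding.
    print(area(n, k, n), side(n, k, n) ** 2)
```

Working with the side length rather than the area turns a nonlinear recurrence into a linear one, which is what makes the iteration that follows tractable.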

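Iterating this recurrence and recalling that $n = 2^k$, we have

$$
\begin{aligned}
L_n(n) &= f_n(n) + \sqrt{2}\, f_n(n/2) + 2 f_n(n/4) + \cdots + \sqrt{2}^{\,k-1} f_n(2) + \sqrt{2}^{\,k} L_n(1) \\
&\le (6n/k) \left[ 1 + \sqrt{2} + \cdots + \sqrt{2}^{\,\lg k} \right] + \sqrt{n}\, L_n(1) \qquad (10) \\
&= (6n/k) \left[ \left( \sqrt{2}^{\,(\lg k)+1} - 1 \right) / \left( \sqrt{2} - 1 \right) \right] + \sqrt{n}\, L_n(1) \\
&= O((n/k)\sqrt{k}) \\
&= O(n/\sqrt{k}).
\end{aligned}
$$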

The reason the sum of the powers of $\sqrt{2}$ goes only as far as $\lg k$ in line (10) is that $f_n(m)$ is zero after this
point. Since $A_n(n)$ is the square of $L_n(n)$, the area of the layout is $O(n^2/k)$.
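As a sanity check on the derivation ending in line (10), the following Python sketch compares the unrolled sum with the closed-form geometric bound for a few small powers of two. It assumes $L_n(1) = 1$ (an arbitrary constant, exposed as the parameter `l1`), and the function names are hypothetical.

```python
import math


def unrolled_side(k: int, l1: float = 1.0) -> float:
    """Sum of sqrt(2)^j * f_n(n / 2^j) for j = 0..k-1, plus sqrt(n) * L_n(1)."""
    n = 2 ** k
    total = 0.0
    for j in range(k):
        m = n >> j
        f = 6 * n / k if m > n / k else 0.0  # equation (8)
        total += math.sqrt(2) ** j * f
    return total + math.sqrt(n) * l1


def closed_form_bound(k: int, l1: float = 1.0) -> float:
    """Geometric-sum bound: (6n/k) (sqrt(2)^(lg k + 1) - 1) / (sqrt(2) - 1) + sqrt(n) L_n(1)."""
    n = 2 ** k
    r = math.sqrt(2)
    return (6 * n / k) * (r ** (math.log2(k) + 1) - 1) / (r - 1) + math.sqrt(n) * l1


if __name__ == "__main__":
    for k in (2, 4, 8, 16):
        n = 2 ** k
        side = unrolled_side(k)
        # Columns: k, whether the bound holds, and the area divided by n^2 / k.
        print(k, side <= closed_form_bound(k), side * side / (n * n / k))
```

If the $O(n^2/k)$ claim holds, the last column should stay below a fixed constant as $k$ grows.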

This technique has been used in Figure 2 to lay out a shuffle-exchange network on 256 vertices. Only one
fourth of the layout is shown, and the dissection that was used differs slightly from the one in Section 5.
Instead of removing exchange edges, the arbitrary divisions among vertices mapped to zero are chosen so that
$e(v)$ is in the same component as $v$, and the two are placed together.


7.  Conclusion 



We have developed an extraordinary amount of machinery in order to construct an $O(n^2/k)$ area layout for
the shuffle-exchange graph on $n = 2^k$ vertices, and indeed, we have only been able to show this upper bound
for the case when $k$ is a power of two. It may be that this bound holds when $k$ is not a power of two, but we
have not been able to prove this. For the time being, the best general upper bound seems to be Thompson's
$O(n^2/\sqrt{k})$ bound.

In any event, a gap remains between either of these upper bounds and the best known lower bound of
$\Omega(n^2/k^2)$, which is also given by Thompson. This lower bound is proved in [9] by showing that the minimum
bisection width of the shuffle-exchange graph must be $\Omega(n/k)$ and that the area of any graph layout must be
at least the square of the minimum bisection width of the graph. Theorem 3 shows that this $\Omega(n/k)$ lower
bound for bisection of the shuffle-exchange graph can be achieved, even though the dissection based on this
bisection does not achieve the $\Omega(n^2/k^2)$ lower bound for layout area. This is because the bisection width
$f_n(m)$ does not immediately decrease as $m$ decreases from $n$. It may be that an improved lower bound for the
layout area will be based on the notion of a minimum dissection, where the width of every bisection in any
dissection can be bounded from below.
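To give a feel for the size of the gap, the sketch below evaluates the three bounds with all hidden constants set to one. That is an assumption made purely for illustration, since asymptotic bounds say nothing about the actual constants. The function name `layout_bounds` is hypothetical.

```python
import math


def layout_bounds(k: int):
    """Return (lower bound, this paper's upper bound, Thompson's upper bound), constants dropped."""
    n = 2 ** k
    lower = n * n / (k * k)            # Omega(n^2 / k^2)
    this_paper = n * n / k             # O(n^2 / k), shown for k a power of two
    thompson = n * n / math.sqrt(k)    # O(n^2 / sqrt(k))
    return lower, this_paper, thompson


if __name__ == "__main__":
    for k in (4, 8, 16, 32):
        lower, upper, thompson = layout_bounds(k)
        # Ratios of each upper bound to the lower bound: k and k^1.5.
        print(k, upper / lower, thompson / lower)
```

With constants ignored, the construction of this paper sits a factor of $k = \lg n$ above the lower bound, and Thompson's bound sits a factor of $k^{3/2}$ above it.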

On the other hand, it may be that an $O(n^2/k^2)$ area layout does exist for the shuffle-exchange graph, as
does one for the cube-connected-cycles (CCC) network of Preparata and Vuillemin [6]. The CCC is the graph
that arises from a boolean hypercube of $d$ dimensions when each vertex is replaced by a cycle of $d$ vertices.
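The CCC definition can be made concrete with a short Python sketch. It uses one common labeling, in which position $i$ on each cycle carries the hypercube edge for dimension $i$. That labeling and the function names are assumptions of this sketch rather than details given here.

```python
from itertools import product


def cube_connected_cycles(d: int):
    """Return (vertices, edges) of the CCC built from a d-dimensional hypercube."""
    assert d >= 3, "for d < 3 the cycles degenerate"
    # Vertex (x, i): position i on the cycle that replaces hypercube vertex x.
    vertices = list(product(range(2 ** d), range(d)))
    edges = set()
    for x, i in vertices:
        # Cycle edge to the next position on the same cycle.
        edges.add(frozenset({(x, i), (x, (i + 1) % d)}))
        # Cube edge along dimension i to the neighboring cycle.
        edges.add(frozenset({(x, i), (x ^ (1 << i), i)}))
    return vertices, edges


def degrees(vertices, edges):
    """Count the edges incident to each vertex."""
    deg = {v: 0 for v in vertices}
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return deg


if __name__ == "__main__":
    vertices, edges = cube_connected_cycles(3)
    print(len(vertices), len(edges), set(degrees(vertices, edges).values()))
```

The graph has $d \cdot 2^d$ vertices, and for $d \ge 3$ every vertex has degree three: two cycle edges and one cube edge.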

Many of the problems that can be solved quickly using the shuffle-exchange interconnection can also be
solved quickly using the CCC. But despite the fact that a smaller layout is known for the CCC, descriptions of
algorithms for the CCC tend to be more complicated. The discovery of an $O(n^2/k^2)$ area layout for the
shuffle-exchange graph would therefore favor the shuffle-exchange graph as the network of choice and would
allow the many algorithms already designed for this network to be applied directly in optimal VLSI
implementations. But until such a layout is found, if ever one is found, the CCC will continue to have the
edge.

In conclusion, we believe that characteristic polynomials provide a useful way of viewing the
shuffle-exchange network, and we believe that this approach goes beyond the particular technical results presented
here. Characteristic polynomials unveil properties of the shuffle-exchange graph that are obscured by the
classical approach of relating the vertices to integers. We hope that the mechanisms we have developed to
relate the topology of a particular graph to the algebra of polynomials will be exploited further.